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We present a parallel implementation of cellular automata to simulate chemical reactions on 
surfaces. The scaling of the computer time with the number of processors for this parallel im- 
plementation is quite close to the ideal T/P, where T is the computer time used for one single 
processor and P the number of processors. Two examples are presented to test the algorithm, the 
simple A+B^ model and a realistic model for CO oxidation on Pt(110). By using large parallel 
simulations, it is possible to derive scaling laws which allow us to extrapolate to even larger system 
sizes and faster diffusion coefficients allowing us to make direct comparisons with experiments. 

PACS numbers: 82.65.+r; 82.20.Wt; 02.70.Tt; 82.40.Np; 89.75.Da 



I. INTRODUCTION 

One of the most interesting features of surface reac- 
tions is that in a large number of cases produce pat- 
tern formation, structures with some well-defined length 
scale, sometimes with symmetries and temporal behav- 
ior, such as oscillations, traveling waves, spirals, Turing 
patterns, etc @, §|. A usual approach to study this 
pattern formation is reaction-diffusion (RD) equations 
H , which simulate the dynamic behavior of chemical re- 
actions on surfaces. However, these partial differential 
equations give only approximate solutions and in several 
cases completely wrong results, because they are based on 
the local mean field approximation, meaning well-mixed 
reactants at microscopic level, ignoring all the correla- 
tion terms between reactants locally, fluctuations and lat- 
eral interactions in the adsorbate Q . The RD equations 
describe the coverage which are macroscopic continuum 
variables which neglect the discrete structure of matter, 
and do not describe the actual chemical process underly- 
ing pattern formation. In fact from experimental studies 
j| it is known that to model correctly, a modified kinetic 
has to be assumed different from the prescribed for RD. 

Based in some general assumptions of the physics pro- 
cesses involved, an esencial master equation can be de- 
rived that completely describes the microscopic dynamic 
of the system M . An exact method to solve this master 
equation is the Monte Carlo (MC) method §, 0, §, §. 
In a Monte Carlo method a sequence of discrete events 
(reactions, including diffusion) is generated on a 2D lat- 
tice which represents the surface sites. These events are 
generated in general in a random way, looking for possi- 
ble enabled reactions between nearest neighbors on the 



'Electronic address: 
T Electronic address: 
* Electronic address: 



R.. Salazar@tue.nl 



;gtati@chcm, tuc.nl 



•cuzovkov@latnet . lv 



lattice, and doing the reactions with some probability in 
correspondence with some defined reaction rates. 

To compare MC simulations with experimental pat- 
tern formation it is necessary to fill the gap between the 
length scale of the individual particles and the diffusion 
length. The regime where spatio-temporal patterns usu- 
ally occurs, from /im to mm scales, is orders of magnitude 
larger than the nm scale of individual particles. However, 
some new experiments [|l0| |ll| show that the fast kinetics 
processes are typically accompanied by the appearance of 
nanostructures. Only microscopic simulations could deal 
with this two-scales behavior. However this regime im- 
plies lattices sizes above 10 6 x 10 6 and very large values 
for the diffusion rates to produce agreement with the ex- 
perimental observed scales on pattern formation. This 
would be a very large and slow simulation, due to the 
huge number of particles involved and the fast diffusion 
rates which means that most of the simulation time is 
spend doing diffusion of particles instead of chemical re- 
actions. Fortunately, we do not need to simulate every 
time such a macroscopic system or experimental diffu- 
sion rates. We only need to find scaling laws for lengths 
and diffusion coefficients, i.e from nm scale to /mi or mm 
scale, and from microscopic to real time. This is the 
possibility which we explore in this paper by using large 
simulations within parallel computers. 

Although only MC simulations provides solutions to 
the exact master equations for the surface reactions, they 
are not suitable for efficient parallelization due to the 
random selection of lattice sites used. However, there is 
another important approach to simulate discrete events 
on lattices, the Cellular Automata (CA) |l3j. This 
approach is fully parallel in the sense that all the lattice 
sites can be updated simultaneously. This has the ad- 
vantages that fewer random numbers are required, only 
those for the reaction probabilities (CA codes are faster 
than MC), and the global updating is easier to implement 
in a parallel code. 

Under some well-defined conditions a CA is equivalent 
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to a MC simulation (see ref. [H and section II). Then 
a CA could be used as a MC simulation and its paral- 
lelization will provide with the required scaling laws. In 
this paper we present for first time an attempt to provide 
a tool to get these scaling laws. The fact that CA are 
ideal simulation methods for parallelization is shown in 
a recent special issue dedicated to CA in Parallel Com- 
puting |lq| . In particular a quite interesting paper by 
J.R. Weimar [jl6| presents an object oriented parallel CA, 
also based in ref. |Q, and explores the possibility of di- 
vide the surface in regions to be simulated by RD or CA 
according to the level of detail required. 

In this paper we will describe in detail the implemen- 
tation of the parallel version of a CA simulating chem- 
ical reactions on surfaces, but the ideas discussed here 
are also applicable to more general CA simulations. In 
section II we will describe briefly the Cellular Automata 
method. In section III we explain the key points to im- 
plement the parallelization of the algorithm. In section 

IV we will present results of the application of the par- 
allel algorithm to the A+B^ model, and to a realistic 
model of oxidation of CO on Pt(llO). Finally in section 

V we will draw some conclusion. 



II. CELLULAR AUTOMATA 

A cellular automaton (CA) is a regular array of cells. 
Each cell can be in one of a set of possible states. The 
CA evolves in time in discrete steps by changing the 
states of all cells simultaneously. The next state which 
a cell will take is based on the previous state of the cell 
and the states of some neighboring cells. A CA is de- 
fined by providing prescriptions for the lattice, the set of 
states, the neighbors, and the transition rules jlj, [L3| . 
The idea of CA deserves by itself a lot of study, since 
in general the evolution of a CA cannot be predicted 
other than by executing it. Additionally the amount of 
possible CAs is quite large. For instance, in the simplest 
one-dimensional case, with two states and two neighbors, 
there are 256 possible CA. 

Our CA uses the standard square two-dimensional lat- 
tice of size L x L. The states of each cell represent 
occupation with some kind of particle. The Margolus 
neighborhood |l2| is used instead of the common von 
Neumann neighborhood. Both definitions of neighbor- 
hood are shown in Fig.|l]. In order to obey the CA laws 
in von Neumann neighborhood, it is necessary |l7f dis- 
obey the laws of stoichiometry because one particle could 
participate in more than one reactive pair. Similar prob- 
lems arise with the diffusion of particles Jlifl . The use 
of a Margolus neighborhood overcomes these difficulties 
p9[ pp[ pl| p2[ . Using the value of the chemical rates we 
set up some probabilistic transition rules to change the 
states of the cells inside each Margolus block. 

The Margolus blocks, Fig.|l|, are used in the following 
way. The blocks are periodically repeated to build up 
a tiled mask over the whole lattice, considering periodic 



FIG. 1: Von Neumann neighborhood (left) and Margolus 
neighborhood (right). The lines joining points show the near- 
est neighbor couples in each case. 



FIG. 2: The four possible tilings using Margolus blocks. 
The arrows show the sequence order. The lines joining points 
illustrate the MC updating of pairs inside blocks. 



boundary conditions. In this way there are four possible 
tilings as shown in Fig.||. Only neighbor sites belong- 
ing to the same block can react, so all the blocks can 
be accessed in parallel. Inside each block a Monte Carlo 
update is done. After a full lattice update the whole pro- 
cedure starts over again using another of the four possible 
tilings. The dynamics is not confined to blocks, because 
the boundary between blocks is changing from one global 
sweep to the next. 

For the four tilings shown in Fig.^|, we chose randomly 
one of the four possible sequences of tilings: (1,2,3,4), 
(2,3,4,1), (3,4,1,2), and (4,1,2,3). They show always the 
same clockwise cyclic sequence, but starting at a differ- 
ent tiling. This produces better boundary diffusion and 
mixing between blocks. 

A schematic description of the steps to implement this 
CA is given at the following: 

1. Choose randomly one of the four possible sequences 
of tilings: (1,2,3,4), (2,3,4,1), (3,4,1,2), (4,1,2,3). 

2. Choose consecutive tilings from the sequence of 
step 1. 

3. Sweep over all the blocks and in each block make a 
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single Monte Carlo update: 

3.1. Choose randomly one pair of neighbors. 

3.2. Choose a reaction i from the set of all the pos- 
sible reactions with a probability proportional 
to the reaction rate ki . 

3.3. Check if the reaction chosen in step 3.2 is pos- 
sible on the sites chosen in step 3.1. If it is 
possible do the reaction, otherwise do noth- 
ing. 

4. Increase the time by At/4, and return to step 2 
until the sequence of tilings is completed. 

5. Return to step 1. 

The updating scheme inside each block is the same as 
in the Dynamic Monte Carlo algorithm called Random 
Selection Method || Consequently, the time incre- 
ment is the same as one Monte Carlo Step (MCS) ||: 
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where the sum in the denominator corresponds to the 
sum of all the possible reaction rates. 

There is a large number of possible CA prescriptions 
which could try to reproduce a MC simulation of chem- 
ical reactions on surfaces. In fact, an extensive study 
made by J. Mai @ M, 121, " 



shows that in general it 



is difficult produce good CAs reproducing chemical reac- 
tions and diffusion. However, in a recent paper Kortlukc 
|jl4| studies under which conditions a CA could reproduce 
adequately a MC simulation of chemical reactions on sur- 
faces. He found that the main requirement is using large 
diffusion coefficients. It is required some sort of compro- 
mise between MC and CA: it is necessary use a regular 
array of blocks as in CA, and a MC update scheme has to 
be used inside each block. The CA which we use here is 
a small modification of that used by Kortliike originally 
fuj . We use blocks of 4-sites (Margolus blocks) instead 
of 2-sites (Hantels blocks) . The main advantage of using 
this 4-sites instead of 2-sites is that reduces the level of 
CA noise (see |lj]) inside each block. This CA noise is 
the difference between the diffusion simulated with CA 
and the correct diffusion simulated with MC. In fact us- 
ing larger blocks the noise will be smaller. 

Additionally we present here a full realization of the 
CA parallel computing idea by implementing this in a 
parallel code as is shown in the next section. This is 
the main reason for use CA instead MC as a solution 
for large-time and large-sizes simulations. Otherwise the 
only use of CA instead MC in a serial simulation does not 
speed up the simulation substantially, in the best cases 
only ~ 10-20%. This means that the implementation of 
an efficient CA is very important. 

Despite that we have to accept that our CA is an ap- 
proximation to exact MC microscopic simulations, we 
known when the approximation is good and we can al- 
ways check that the CA is reproducing the MC results. 



In this way we can think about the CA as the MC re- 
alization of the exact master equation for the chemical 
processes, getting in this way, the possibility of obtain 
the scaling laws mentioned at the introduction for these 
physical systems. In this paper we do not discus the 
problem of quality of the CA approximation to the MC 
realization. This was done in p4| and we have made for 
our modified CA a similar study with similar results. 

The fact that blocks can be updated without referring 
to other blocks allows us to do CA as a parallel algorithm, 
because the whole set of blocks can be updated at the 
same time. This is the subject of the next section. 



III. PARALLELIZATION 

The aim of the parallelization of any code is to dis- 
tribute the whole simulation over several computer pro- 
cessors, also called nodes. We define speed(P) = l/Tp, 
where Tp is the computer time used by P-processors to 
complete a simulation. The optimum result is that this 
multiplies the speed of the simulation with the number 
of nodes, speed(P) = P x speed(l). However, the usual 
result of parallelize a code is not the optimum. The key 
point to achieve a good speed up is that the time spent 
by each node sending and receiving information from/to 
the other nodes is small in comparison with the time con- 
sumed doing computing in each node. 

In order to distribute the work, we use here a geo- 
metrical division as usual [fl5| in spatial extended sys- 
tems, i.e., the full lattice is divide in sublattices of equal 
area. The time spent sending data is proportional to the 
length of the borders, ~ L, and the time doing comput- 
ing is proportional to the system size ~ L 2 . We divide 
the lattice in strips as is shown in Fig.[|. Considering 
the global periodic boundary conditions, each node has 
periodicity in one direction and in the other directions 
has to share information with only two neighbor nodes. 
Another possibility is divide the lattice in squares. This 
choice is less convenient. Each node has to share informa- 
tion with 4 nodes instead 2, and it is only possible to use 
a perfect square number of nodes P = 4, 9, 16, .... Using 
the strips sublattices requires sending a factor 4/ \/~P less 
data than using the square sublattices (2^], provided to 
have P < 16. 

Note that due to the periodicity in the vertical direc- 
tion inside each node, it is not necessary to interchange 
border information between nodes when we pass from the 
tiling 1 — > 2 and 3^4. It is only necessary when we pass 
from 2—>3 and 4 — > 1. This reduces in half the amount 
of communication data. From Fig.[| we see the direction 
in which the first column of data in each sublattice has 
to be be sent and received in each node. In the change 
of tilings 2 — ► 3 the data goes from left to right and in 
4 — > 1 from right to left. Instead of shifting the data of 
each node horizontally we use the first column for send- 
ing to or receiving from other nodes. As a consequence, 
and to get interaction between sites in the same block, we 
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FIG. 3: Distribution of the lattice in strips sublattices for 
computing in each node. Division for each tiling shown in 
Fig.| 



consider the first column neighbor of the last one. For 
this Blocking CA, where the reactions are confined in- 
side the blocks, this produces an effective periodicity in 
the horizontal direction. This extra periodic boundary 
condition inside each sublattice, makes the final parallel 
code very similar to a full lattice implementation. 

We use the SPMD (single program, multiple data) 
model to implement the parallel simulation, in which ev- 
ery node executes the same code using different sublat- 
tice. Also, we use a main node, called node zero, dedi- 
cated additionally to distribute and collect the global in- 
formation needed for the input /output of the simulation 
and other global computing required. Every node exe- 
cutes the same code, and when some special code has to 
be executed for only some nodes, is necessary use a node 
identifier number p. From the global periodic boundary 
conditions and the shape of the sublattices, there is peri- 
odicity also in the sequence of nodes: the node left with 
respect to node p = is node p = P — 1 (P is the to- 
tal number of nodes) and the node right with respect to 
node p = P — 1 is node p = 0. 

In the following we present the CA code for a single 
node. 

1. p = 0: Chose randomly one of the four possi- 
ble tiling sequences: (1,2,3,4), (2,3,4,1), (3,4,1,2), 
(4,1,2,3). Send that choice to the other nodes. 

p > 0: Receive the information of which tiling se- 
quences has been chosen. 

2. Vp: Choose consecutive tilings from the sequence 
of step 1 . 

Nl. Vp: If the previous tiling was 1 or 2 and the new 
tiling is 3 or 4 then 

send the first column to the left node p — 1, 



receive the data from right node p + 1 and put it in 
the first column. 

N2. Vp: If the previous tiling was 3 or 4 and the new 
tiling is 1 or 2 then 

send the first column to the right node p + 1 
receive the data from left node p — 1 and put it in 
the first column. 

3. Vp: Sweep over all the blocks and in each block 
make a single Monte Carlo update. 

4. Vp: Increase the time by At/ 4, and return to step 
2 until the tiling sequence is completed. 

5. Vp: Return to step 1. 

We have modified step 1 and added two new steps Nl 
and N2, to interchange information between the nodes. 
The rest is basically the same as the single processor code, 
but applied to the respective sublattices for each node. 
In order to avoid undesired correlations between different 
sublattices, special attention should be given to a good 
random number generator, producing different sequence 
of random numbers within each node H] . 

For implementation of the interprocess communica- 
tion and synchronization, the message passing interface 
MPI library has been selected, because it provides source 
portability to different kind of computers. This library 
provide, amongst a set of specialized and complete com- 
munication routines, a so called six-basic set of routines 
for interprocessor communication: initialization, termi- 
nation, getting the set of nodes, getting its own node 
number, sending data to other nodes, and receiving data 
from other nodes. 

There are computers with connection between nodes 
using giga-ethernet or several processors inside the same 
machine, sharing the same memory. These comput- 
ers represent optimal environments to test parallel al- 
gorithms, i.e. high-end supercomputers like Cray T3E, 
and middle-end supercomputers like Silicon Origin 2000. 
However, we will show in the next section that the re- 
sults of running this parallel CA algorithm in a low-end 
Beowulf cluster of PCs connected only via fast-ethernet 
produce already almost the ideal speed up. 



IV. RESULTS 

The parallel algorithm was tested on our local clus- 
ter of PCs, (17 Athlon l.lGhz/256Mb, fast-ethernet, 
Linux 2.4.18, MPICH 1.1.2). From the results we see 
that the improvement of the performance using P pro- 
cessors with respect to a single processor is almost the 
ideal, speed(P) = Pxspeed(l). In order to test the speed 
up of the algorithm, we will use two systems from surface 
catalysis: the A+B^ model and a model for CO oxida- 
tion on Pt(110) surfac e ]25||, which has produced several 
important results pi |27L &8l M. 
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FIG. 4: Temporal behavior of the concentration of particles 
for the A+B^ model starting with randomly mixed A and 
B. The solid line is the result from the simulation using 16 
processors and the dots are from using a single processor. 
The f -1 / 2 curve shows the asymptotic behavior. The two 
snapshots show the system at different times. 

The A+B^ model has been studied for a long time 
p0| , pl| |32| |35| , |34| , |35[ | In the pioneering analitycal paper 
by Ovchinnikov and Zcldovich | f30| it was shown for first 
time that the kinetic law of mass action is violated in this 
model, producing incorrect results when standard chem- 
ical kinetics is used. An illustrative case, also used in our 
paper, is a situation with equal concentrations of both 
reactants Ca = Cb = C, where the standard kinetics 
predicts an asymptotic behavior C oc t . This predic- 
tion correspond to the mean-field approximation and it 
is only valid for high dimensional systems |30|, V >4. In 
the low dimensional systems T> <4 with diffusion con- 
trolled processes it has been proved, using renormaliza- 
tion group arguments [p6| , that the correct asymptotic 
behavior is: C oc £ _x> / . A qualitative agreement with 
this behavior was shown for first time using MC simula- 
tions in reference J57|. 

The asymptotic law needs large simulation time t max . 
The diffusion length = \Hjt defines the pattern for- 
mation scale. A simulation until t = t max needs a lat- 
tice length L £(t max ), which correspond to a large 
simulation time of the order of t max L 2 ~ t 2 max - This 
case provides a good example of large-time and large- 
size systems with pattern formation to test our parallel 
algorithm described in the previous section. 

In the corresponding lattice model we consider two 
kinds of particles A and B. The only possible chemical 
reaction is desorption of AB, which happen when two 
particles A and B are next to each other, creating two 
empty sites. This process occurs with a rate constant k. 
Additionally the particles are allowed to diffuse with rate 
D. This happen when a particle is next to an empty site. 
We simulate the behavior produced for an initial condi- 
tion without empty sites and where the same number of 
A and B particles are initially randomly distributed on 
the surface: N A = N B = L 2 /2. 



500 1000 1500 

FIG. 5: The radial correlation function for the A+B^ 
model The same times as the snapshots in Figjij. The lines 
are the 16 processor results and the dots the single processor 
results. The dashed lines are fits with exp [-(r/r c ) 2 ]. The 
respective values r c (t) are plotted in the insert. 

In Fig.[| we present the temporal behavior and also 
illustrate the segregation process forming regions with 
high concentration of A or B, which increase in size 
with time. Also we show how the global concentration, 
C = Na/L 2 = Nb/L 2 , diminishes with time following 
the asymptotic power-law C ~ t~2. The system size 
used is L = 8192 and the parameters are k = D = 1. 
We present two sets of data in Figji[ The points cor- 
respond to a single processor simulation, and the solid 
line corresponds to a parallel simulation using 16 proces- 
sors. Both simulations start with identical random initial 
distribution of particles. It is noticeable that this initial 
distribution almost completely determines the following 
behavior of the system. The snapshots shown as insert 
in Fig.|| are from the parallel simulation. They are very 
similar to the ones obtained from the single processor 
simulation, which uses the same initial conditions but a 
different sequence of random numbers to simulate the dy- 
namics. Moreover, in order to check quantitatively the 
agreement between spatial structures between the single 
and parallel simulations, we present in FigJE] the radial 
correlation function. Again the points correspond to the 
single processor case and the solid line to the parallel 
case. A correlation length r c could be obtained by fitting 
these correlation functions with exp[— (r/r c ) 2 ], which we 
show in Fig.|| by dashed lines. The obtained values of r c 
are plotted in the insert. This shows that the correlation 
length for this dynamic also follows a power law r c ~ t a . 
By numerical fitting we obtain a = 0.5122 ± 0.012. An 
analytical asymptotical solution for this correlation func- 
tions is given in exp(— r 2 /4Z)i) or exp[— c(r / t;(t)) 2 ]. 
The previous value obtained for a means that the diffu- 
sion length £ (i) = \f~Dt define the scale of pattern forma- 
tion for this reaction model. 

The performance or speed up of the parallel algorithm 
for the A+B^ model is shown in Figj6[ using different 
system sizes L = 256,512,1024,4096,8192 and number 
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FIG. 6: The speed of the simulation using P processors 
normalized to the speed of one single processor for the A+B^ 
model. Different system sizes L = 256, 512, 1024, 4096, 8192 
and number of processors P = 1, 2, 4, 8, 16. In the insert the 
behavior of the speed of one single processor. 



of processors P — 1,2,4,8,16. The simulated time for 
each computation was set to t max = 500. The speed was 
normalized to the speed of the single processor case. In 
the insert we can see the behavior of the single processor 
speed for each system sizes. The advantage of use a large 
number of processors increases when the system size in- 
creases, as we expect from the discussion in the previous 
section. 

The second model used to test the parallel algorithm 
is a model for CO oxidation on Pt(100) and Pt(110) sur- 
faces ||, |6[ |7|, ||, !| . This system shows different 
types of kinetic oscillations. On Pt(100) local, irregular 



oscillations occur in a wide parameter interval, whereas 
on Pt(110) globally synchronized oscillations exist only in 
a very narrow parameter interval. Both surfaces exhibit 
an a ^ (3 surface reconstruction, where a denotes the hex 
or 1 x 2 phase on Pt(100) or Pt(110), respectively. (3 de- 
notes the unreconstructed lxl phase in both cases. Both 
surfaces have qualitatively quite similar properties with 
the exception of the dissociative adsorption of O2. The 
ratio of the sticking coefficients of O2 on the two phases 
is s a : S0 « 0.5 : 1 for Pt(110) and s a : sp ps 10~ 2 : 1 for 
Pt(100) From the experiments Q it is known that 
kinetic oscillations are closely connected with the a ^ (3 
reconstruction of the Pt surfaces. 

In the model [^5], |27], |28|, p9fl , CO is able to absorb 
onto a free surface site with rate constant y and to desorb 
from the surface with rate constant fc, independent of 
the surface phase to which the site belongs. O2 adsorbs 
dissociatively onto two nearest neighbor sites with rate 
constant (1 — y)s x with x — a $- In addition CO is able 
to diffuse via hopping onto a vacant nearest neighbor 
site with rate constant D. The CO+O reaction occurs 
with rate constant R, when CO and O arc in nearest 
neighbor sites desorbing the reaction product CO2. The 
a ^ (3 surface phase transition is modeled as a linear 
front propagation induced by the presence of CO in the 



border between phases with rate constant V. Consider 
two nearest neighbor surface sites in the state a/3. The 
transition a/3 — ► aa (a/3 — ► f3f3) occurs if none (at least 
one) of these two sites is occupied by CO. Summarizing 
the above transition definitions written in the more usual 
form of reaction equations gives: 



CO(g) + s* ^ 
2 (g) + 25 a - 
2 (g) + 2^- 
CO(a) + S x -» 
CO(a) + O(a) 



CO (a), 

20(a), 

20(a), 

S x + CO(a), 

^C0 2 (g) + 2S*, 



where S stands for a free adsorption site, \ stands for 
either a or (3 and (a) or (g) for a particle adsorbed on the 
surface or in thegas phase, respectively. For additional 
details see ref. f25| . 

Amongst several successful results of this model, we 
can mention that it was one of the first microscopic mod- 
els for CO oxidation on Pt including surface reconstruc- 
tion, which is nowadays widely accepted as the key el- 
ement in order to get oscillatory behavior. This model 
reproduce correctly oscillatory regimes for both surfaces 
Pt(100) and Pt(llO), by changing only one parameter 
s a . The diffusion of CO is consider explicitly and could 
be applied to the fast diffusion regime without modifi- 
cation. With this model an alternative mechanism for 
global synchronization of oscillation has been suggested 
p7[ , different from the traditional gas-phase coupling. 
This new mechanism is stochastic resonance, obtained by 
including a spontaneous nucleation of one surface phase 
in the other a ^ [3 at very low rates. One unique result 
reproducing experimental observations is the transition 
into chaotic behavior via the Feigenbaum route or period 
doubling ^6|. It is also for this model that the compat- 
ibility of both microscopic simulations MC and CA has 
been study extensively |14[ . 

In Fig.M we show snapshots from a simulation on 
Pt(llO) and a system size L — 8192 using the param- 
eter values, D = 250, V = 1, y = 0.494, k = 0.1, R = D. 
In the left part we plot the chemical species: CO parti- 
cles are dark-grey, O particles are light-grey, and empty 
sites are black. The right part shows the structure of 
the surface: a phase sites are black, and (3 phase sites 
are white. The pattern formation in this regime shows a 
spatio-temporal behavior, where a spiral dynamics is the 
dominant phenomena. It is interesting see the different 
structures at different spatial scales. For this purpose 
we include in Fig.|?] sections from the upper-left corner 
with sizes 4096 x 4096, 1024 x 1024, 256 x 256, from top 
to bottom. This sequence shows that the spiral dynam- 
ics occurs on a slowly varying island structure of sizes 
of the order of y/D/V. The fact that we can see both 
mesoscopic and microscopic pattern formation is a quite 
interesting feature of the model, which has some exper- 
imental evidence jlO[ [ll[] and has been studied theoret- 
ically in p8[ by including lateral interactions between 
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FIG. 7: Sequence of snapshots of the model of CO oxidation 
on Pt(llO). The left part shows the chemical species: CO 
particles are dark-grey, O particles are light-grey, and empty 
sites are black. The right part shows the structure of the 
surface: a phase sites are black, and f3 phase sites are white. 
The parameters are L = 8192, D = 250, and V = 1, y = 
0.494, k = 0.1, R = D. From top to bottom, we show sections 
from the upper-left corner with sizes 4096 x 4096, 1024 x 1024, 
256 x 256. 

adsorbed particles. The model used here is simpler, be- 
cause does not need that consideration in order to obtain 
nanostructures . 



In Fig.[| we analyze the speed up of the parallel al- 
gorithm of this realistic model. We use system sizes, 
L = 256, 512, 1024, 4096, 8192 and number of nodes P = 
1,2,4,8,16. The simulated time for each computation 




FIG. 8: The same as Fig.|6j, but for the model of CO oxidation 
on Pt(110). 

was also set to t max = 500. Here we can see that the 
speed up of the parallel algorithm is good even for small 
system sizes. This is because the amount of computing in 
each node for this model is larger than for the A+B^ 
model, while the amount of communication data is the 
same in both models. 



V. CONCLUSIONS 

In this paper we present a tool to obtain scaling laws 
connecting experimental system sizes and diffusion coeffi- 
cients to standard values in microscopic MC simulations. 
By using a CA equivalent to MC simulation we provide 
an efficient parallelization algorithm. We have explained 
in detail how to implement the parallelization. The speed 
up of the algorithm is almost ideal and it is much better 
for larger system sizes and more complex models. A full 
description and analysis of the scaling laws for the second 
model used here is in preparation ]39| . 
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